Minimizing the disclosure risk of semantic correlations in document sanitization
Authors
Abstract
David Sánchez ⇑, Montserrat Batet, Alexandre Viejo
⇑ Corresponding author. Address: Departament d’Enginyeria Informàtica i Matemàtiques, Universitat Rovira i Virgili, Av. Països Catalans, 2 Tarragona, Spain. Tel.: +34 977 559657; fax: +34 977 559710. E-mail address: [email protected] (D. Sánchez).
0020-0255/$ - see front matter © 2013 Elsevier Inc. All rights reserved. http://dx.doi.org/10.1016/j.ins.2013.06.042
Similar resources
Utility-preserving sanitization of semantically correlated terms in textual documents
Traditionally, redaction has been the method chosen to mitigate the privacy issues related to the declassification of textual documents containing sensitive data. This process is based on removing sensitive words in the documents prior to their release and has the undesired side effect of severely reducing the utility of the content. Document sanitization is a recent alternative to redaction, w...
Data sanitization in association rule mining based on impact factor
Data sanitization is a process used to promote the sharing of transactional databases among organizations and businesses; it alleviates the concerns of individuals and organizations regarding the disclosure of sensitive patterns. It transforms the source database into a released database so that counterparts cannot discover the sensitive patterns and so data confidentiality is preserved ag...
Detecting Term Relationships to Improve Textual Document Sanitization
Nowadays, the publication of textual documents provides critical benefits to scientific research and business scenarios where information analysis plays an essential role. Nevertheless, the possible existence of identifying or confidential data in this kind of document motivates the use of measures to sanitize sensitive information before it is published, while keeping the innocuous data unmod...
Document Sanitization: Measuring Search Engine Information Loss and Risk of Disclosure for the Wikileaks cables
In this paper we evaluate the effect of a document sanitization process on a set of information retrieval metrics, in order to measure information loss and risk of disclosure. As an example document set, we use a subset of the Wikileaks Cables, made up of documents relating to five key news items which were revealed by the cables. In order to sanitize the documents we have developed a semi-auto...
Automatic Declassification of Textual Documents by Generalizing Sensitive Terms
With the advent of the Internet, large numbers of text documents are published and shared every day. Each of these documents contains a vast amount of information. Publicly sharing some of this information may compromise privacy if it is confidential. So before document publishing, sanitization operations are performed on the document for preserving the...
Journal title:
- Inf. Sci.
Volume 249, Issue
Pages -
Publication date 2013